Database management systems will continue to manage large data volumes. Thus, efficient 
algorithms for accessing and manipulating large sets and sequences will be required to 
provide acceptable performance. The advent of object-oriented and extensible database 
systems will not solve this problem. On the contrary, modern data models exacerbate the 
problem: In order to manipulate large sets of complex objects as efficiently as today's 
database systems manipulate simple records, query-processi ... 

Keywords: complex query evaluation plans, dynamic query evaluation plans, extensible 
database systems, iterators, object-oriented database systems, operator model of 
parallelization, parallel algorithms, relational database systems, set-matching algorithms, 
Full text available: pdf(927.23 KB) Additional Information: full citation, abstract, references, index terms 

As Internet traffic continues to grow and websites become increasingly complex, 
performance and scalability are major issues for websites. Websites are increasingly relying 
on dynamic content generation applications to provide website visitors with dynamic, 
interactive, and personalized experiences. However, dynamic content generation comes at a 
cost— each request requires computation as well as communication across multiple 
components. To address these issues, various dynamic content caching ap ... 

Keywords: Edge caching, caching dynamically generated content, fragment caching, 
implementation, proxy caching, world wide web 
Understanding distributed applications is a tedious and difficult task. Visualizations based on 
process-time diagrams are often used to obtain a better understanding of the execution of 
the application. The visualization tool we use is Poet, an event tracer developed at the 
University of Waterloo. However, these diagrams are often very complex and do not provide 
the user with the desired overview of the application. In our experience, such tools display 
repeated occurrences of non-trivial commun ... 

An XML query engine for network-bound data 
Zachary G. Ives, A. Y. Halevy, D. S. Weld 

December 2002 The VLDB Journal - The International Journal on Very Large Data Bases, 
Volume 11 Issue 4 
Full text available: pdf(351.86 KB) Additional Information: full citation, abstract, index terms 

XML has become the lingua franca for data exchange and integration across administrative 
and enterprise boundaries. Nearly all data providers are adding XML import or export 
capabilities, and standard XML Schemas and DTDs are being promoted for all types of data 
sharing. The ubiquity of XML has removed one of the major obstacles to integrating data 
from widely disparate sources - namely, the heterogeneity of data formats. However, 
general-purpose integration of data across the wide area also re ... 

Keywords: Data integration, Data streams, Query processing, Web and databases, XML 




IS '97: model curriculum and guidelines for undergraduate degree programs in 
information systems 

Gordon B. Davis, John T. Gorgone, J. Daniel Couger, David L. Feinstein, Herbert E. 
Longenecker 

December 1997 ACM SIGMIS Database, Guidelines for undergraduate degree programs 
on Model curriculum and guidelines for undergraduate degree 
programs in information systems, Volume 28 Issue 1 

Full text available: pdf(7.24 MB) Additional Information: full citation, citings 



Using LDAP directory caches 
May 1999 Proceedings of the eighteenth ACM SIGMOD-SIGACT-SIGART symposium on 
June 2002 Proceedings of the 2002 ACM SIGMOD international conference on 
Management of data 

Full text available: pdf(137 MB) Additional Information: full citation, abstract, references, citings, index terms 

As Internet traffic continues to grow and web sites become increasingly complex, 
performance and scalability are major issues for web sites. Web sites are increasingly 
relying on dynamic content generation applications to provide web site visitors with 
dynamic, interactive, and personalized experiences. However, dynamic content generation 
comes at a cost — each request requires computation as well as communication across 
multiple components. To address these issues, various dynamic content cach ... 
The Berkeley UNIX Consultant project 

Robert Wilensky, David N. Chin, Marc Luria, James Martin, James Mayfield, Dekai Wu 
December 1988 Computational Linguistics, Volume 14 Issue 4 
UC (UNIX Consultant) is an intelligent, natural language interface that allows naive users to 
learn about the UNIX operating system. UC was undertaken because the task was thought 
to be both a fertile domain for artificial intelligence (AI) research and a useful application of 
AI work in planning, reasoning, natural language processing, and knowledge 
representation. The current implementation of UC comprises the following components: a 
language analyzer, called ALANA, produces a repre ... 

Spoken dialogue technology: enabling the conversational user interface 
March 2002 ACM Computing Surveys (CSUR), Volume 34 Issue 1 

Full text available: pdf(987.69 KB) Additional Information: full citation, abstract, references, citings, index terms, review 

Spoken dialogue systems allow users to interact with computer-based applications such as 
databases and expert systems by using natural spoken language. The origins of spoken 
dialogue systems can be traced back to Artificial Intelligence research in the 1950s 
concerned with developing conversational interfaces. However, it is only within the last 
decade or so, with major advances in speech technology, that large-scale working systems 
have been developed and, in some cases, introduced into commerc ... 

Keywords: Dialogue management, human computer interaction, language generation, 
language understanding, speech recognition, speech synthesis 



The envoy framework: an open architecture for agents 

Murugappan Palaniappan, Nicole Yankelovich, George Fitzmaurice, Anne Loomis, Bernard 
Haan, James Coombs, Norman Meyrowitz 

July 1992 ACM Transactions on Information Systems (TOIS), Volume 10 Issue 3 

Full text available: pdf(2.47 MB) Additional Information: full citation, abstract, references, citings, index terms 

The Envoy Framework addresses a need for computer-based assistants or agents that 
operate in conjunction with users' existing applications, helping them perform tedious, 
repetitive, or time-consuming tasks more easily and efficiently. Envoys carry out missions 
for users by invoking envoy-aware applications called operatives and inform users of mission 
results via envoy-aware applications called informers. The distributed, open architecture 
developed for Envoys is derived from an analysis of ... 

Keywords: application programmer interface, user agent 




The Purdue University network-computing hubs: running unmodified simulation tools 
via the WWW 

Nirav H. Kapadia, Jose A. B. Fortes, Mark S. Lundstrom 

January 2000 ACM Transactions on Modeling and Computer Simulation (TOMACS), 
Volume 10 Issue 1 

Full text available: pdfd 10.49 KB) Additional Information: full dtation . abstract , references , dtinqs , index 
This paper describes the Web interface management infrastructure of a functioning network- 
computing system (PUNCH) that allows users to run unmodified simulation packages at 
geographically dispersed sites. The system currently contains more than fifty university and 
commercial simulation tools, and has been used to carry out more than two hundred 
thousand simulations via the World Wide Web. Dynamically-constructed virtual URLs allow 
the Web interface management infrastructure to support the ... 

Keywords: Internet computing, network-computing, web-based simulation 



Parallel execution of prolog programs: a survey 

Gopal Gupta, Enrico Pontelli, Khayri A.M. Ali, Mats Carlsson, Manuel V. Hermenegildo 
July 2001 ACM Transactions on Programming Languages and Systems (TOPLAS), 
Volume 23 Issue 4 

Full text available: 'p Ipdfd.QS MB) Additional Information: full citation , abstract , references , citings , index 
Since the early days of logic programming, researchers in the field realized the potential for 
exploitation of parallelism present in the execution of logic programs. Their high-level 
nature, the presence of nondeterminism, and their referential transparency, among other 
characteristics, make logic programs interesting candidates for obtaining speedups through 
parallel execution. At the same time, the fact that the typical applications of logic 
programming frequently involve irregular computatio ... 

Keywords: Automatic parallelization, constraint programming, logic programming, 
parallelism, prolog 
June 2001 ACM Computing Surveys (CSUR), Volume 33 Issue 2 

Full text available: pdf(828.46 KB) Additional Information: full citation, abstract, references, citings, index terms 

Data sets in large applications are often too massive to fit completely inside the computer's 
internal memory. The resulting input/output communication (or I/O) between fast internal 
memory and slower external memory (such as disks) can be a major performance 
bottleneck. In this article we survey the state of the art in the design and analysis of 
external memory (or EM) algorithms and data structures, where the goal is to exploit locality 
in order to reduce the I/O costs. We consider a varie ... 

Keywords: B-tree, I/O, batched, block, disk, dynamic, extendible hashing, external 
memory, hierarchical memory, multidimensional access methods, multilevel memory, 
online, out-of-core, secondary storage, sorting 
Investigating link service infrastructures 
David C. De Roure, Nigel G. Walker, Leslie A. Carr 
Keywords: LDAP, Whois++, directory services, distributed link service, link service, open 
hypermedia, query routing 



Form management 
D. Tsichritzis 

July 1982 Communications of the ACM, Volume 25 Issue 7 

Full text available: pdf(2.78 MB) Additional Information: full citation, abstract, references, citings, index terms 

This paper consists of three interrelated parts. In the first part forms are introduced as an 
abstraction and generalization of business paper forms. A set of facilities for the 
manipulation of forms and their contents is outlined. Forms can be created, stored, found, 
viewed in different media, mailed, and located by office workers. Data on forms can also be 
processed in a completely integrated way. The facilities are discussed both abstractly and in 
relation to a prototype ... 

Keywords: database management, office modeling, office procedures 

17 Applying an information gathering architecture to Netfind: a white pages tool for a 
changing and growing Internet 
Michael F. Schwartz, Calton Pu 

October 1994 IEEE/ACM Transactions on Networking (TON), Volume 2 Issue 5 
18 Nomenclator descriptive query optimization for large X.500 environments 
Joann J. Ordille, Barton P. Miller 

August 1991 ACM SIGCOMM Computer Communication Review, Proceedings of the 
conference on Communications architecture & protocols, Volume 21 Issue 4 
19 Technical Session: Supporting ubiquitous computing through directory enabled 
technologies 
Michael Richichi, Paul Coen 

October 2001 Proceedings of the 29th annual ACM SIGUCCS conference on User 
services 

Full text available: pdf(285.27 KB) Additional Information: full citation, abstract, references, citings, index terms 

Drew has been providing computers to students since 1984. Many universities have 
ubiquitous computing programs where students receive a laptop computer as part of their 
educational package. These programs reduce the dependence on and management issues of 
traditional computer labs, and allow 24x7 computing access to every student at the 
University. Drew also provides Novell Directory Services (NDS) accounts to all of these 
students, and utilizes Novell ZENworks to customize software, personalize ... 

Keywords: LDAP, ZENworks, directory services, eDirectory, laptop programs, management, 
ubiquitous computing 

Interactive Editing Systems: Part II 
Norman Meyrowitz, Andries van Dam 

September 1982 ACM Computing Surveys (CSUR), Volume 14 Issue 3 
A system and method are provided for sharing and caching information in a data processing 
system and for efficiently managing a cacheable state shared among processes and clones. In one 
aspect, a method for managing a plurality of caches distributed in a network comprises 
maintaining, by each cache, a plurality of statistics associated with a cacheable object, 
wherein the statistics associated with the cacheable object comprise an access frequency 
(A(o)), an update frequency (U(o)), an update cost (C(o)), and a cost to fetch the cacheable 
object from remote source (F(o)); computing, by each cache, a metric using said statistics, 
wherein the metric quantitatively assesses the desirability of caching the cacheable object; 
and utilizing the metric, by each cache, to make caching decisions associated with the 
cacheable object. 
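
The abstract names the four statistics but this excerpt does not give the formula that combines them. As a minimal sketch only, the following Python assumes a hypothetical metric (fetch cost saved by cache hits minus the cost of keeping the copy current) and a hypothetical threshold rule; the class and function names are illustrative and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass
class ObjectStats:
    """Per-object statistics named in the abstract (units are assumed)."""
    access_freq: float   # A(o): accesses per unit time
    update_freq: float   # U(o): updates per unit time
    update_cost: float   # C(o): cost of applying one update to the cached copy
    fetch_cost: float    # F(o): cost of fetching o from the remote source


def desirability(s: ObjectStats) -> float:
    """Hypothetical metric: fetch cost saved minus update cost incurred."""
    return s.access_freq * s.fetch_cost - s.update_freq * s.update_cost


def should_cache(s: ObjectStats, threshold: float = 0.0) -> bool:
    """Assumed decision rule: cache only when the metric exceeds the threshold."""
    return desirability(s) > threshold


# A frequently read, rarely updated object versus a rarely read, volatile one.
hot = ObjectStats(access_freq=50.0, update_freq=1.0, update_cost=4.0, fetch_cost=2.0)
volatile = ObjectStats(access_freq=2.0, update_freq=30.0, update_cost=4.0, fetch_cost=2.0)
```

Under this assumed formula the read-heavy object scores positive and the update-heavy object scores negative, so each cache would keep the former and skip the latter.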
DOCUMENT-IDENTIFIER: US 6760812 B1 

TITLE: System and method for coordinating state between networked caches 
Brief Summary Text (7): 

A "model" is a template for creating additional, nearly identical copies of a server or process 
instance, such as an application server or servlet engine. Such copies are called "clones". The 
act of creating clones is called cloning. A clone (or cloned process) is a special case of a 
process. Such processes and clones comprise many computer systems. Cloning allows multiple 
copies of the same object to behave together as if they were a single image, with the idea that 
clients experience improved performance. More specifically, processes and clones often perform 
particular tasks and communicate with other processes and clones performing the same or other 
tasks. There are various benefits associated with having separate processes and clones perform 
individual tasks, including but not limited to reusability, understandability, and efficiency. 

Detailed Description Text (42): 

Next, cache retrieve requests (steps 530 and 540) are performed by the network cache manager to 
gather information necessary to initialize the local directory of the network cache manager. 
Typically, such requests are made to one or possibly more peer network cache managers, if any. 
If no other network cache manager is active, then no requests are made. The node, object, and 
dependency information that is returned to the network cache manager by the peer network cache 
manager(s) is placed appropriately into the associated network cache manager directory. This 
locally maintained information allows the network cache manager to independently determine 
where to send invalidation notifications and where particular cached objects exist. 
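
The initialization step described above can be sketched as follows. This is an illustration under assumptions, not the patent's implementation: the class name, the dictionary layout of the directory, and the method names are all hypothetical, and peers are modeled as in-process objects rather than networked services.

```python
from collections import defaultdict


class NetworkCacheManager:
    """Hypothetical manager holding a local directory of nodes, objects, dependencies."""

    def __init__(self, name):
        self.name = name
        self.nodes = set()                         # known cache nodes
        self.object_locations = defaultdict(set)   # object id -> nodes caching it
        self.dependencies = defaultdict(set)       # data id -> dependent object ids

    def export_directory(self):
        """Answer a peer's cache retrieve request with a copy of the directory."""
        return {
            "nodes": set(self.nodes),
            "objects": {o: set(n) for o, n in self.object_locations.items()},
            "dependencies": {d: set(o) for d, o in self.dependencies.items()},
        }

    def initialize_from_peers(self, peers):
        """Merge directory data from active peers; with no peers, no requests are made."""
        for peer in peers:
            info = peer.export_directory()
            self.nodes |= info["nodes"]
            for obj, nodes in info["objects"].items():
                self.object_locations[obj] |= nodes
            for dep, objs in info["dependencies"].items():
                self.dependencies[dep] |= objs

    def invalidation_targets(self, data_id):
        """Nodes to notify when data_id changes, decided from local information only."""
        targets = set()
        for obj in self.dependencies.get(data_id, ()):
            targets |= self.object_locations.get(obj, set())
        return targets


# An already running peer, and a newly started manager that learns from it.
peer = NetworkCacheManager("peer")
peer.nodes = {"node1", "node2"}
peer.object_locations["page:/home"] = {"node1", "node2"}
peer.object_locations["page:/about"] = {"node2"}
peer.dependencies["table:users"] = {"page:/home"}

fresh = NetworkCacheManager("fresh")
fresh.initialize_from_peers([peer])
```

After the merge, the new manager can answer where to send invalidations without contacting the peer again, which is the independence the passage describes.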
DOCUMENT-IDENTIFIER: US 6665658 B1 

TITLE: System and method for automatically gathering dynamic content and resources on the world 
wide web by stimulating user interaction and managing session information 

Detailed Description Text (21): 

Referring to FIG. 2, the process of the present invention may be implemented as follows: 
Session manager 14 retrieves a URL (100) from the URL site list 30. Session manager 14 then 
retrieves the DTD information (102) for the retrieved URL from the Site Information database 
10, which is also passed to the Query Template Builder 16. Session manager 14 then passes the 
retrieved URL and DTD information to the Query Template Builder 16. Query Template Builder 16 
creates a query template (104) for the retrieved URL using the DTD information and passes the 
partial query template to the Query Template Manager 18. Query Manager 18 retrieves the topic 
to be searched (106) from the Search Topics database 12 and inserts the topic into the query 
template (108), which completes the query string. The fully completed query string is then 
passed to the Requester 20, which performs a HTTP request (110) to the URL site 24. Requester 
20 receives the results of the query from the URL site and passes the results (112) to the 
Search Results Manager 22. Typically, the results of a search will contain more than one 
result, and many times more than one page of results. Search Results Manager 22 knows from the 
DTD the page structure/schemata and is able to perform page navigation. If there is more than 
one page of results, the Search Results Manager 22 is capable of instructing the Requester to 
retrieve any additional pages of results (114) and can forward the query string back to the 
Requester 20. This cycle is continued until all of the results of the search are retrieved and 
the Search Results Manager has all of the search results. The retrieved search results or data 
are then passed to the Results Manager 26 for processing. Results Manager 26 can determine if 
there are additional topics to be searched (116) and Query Manager 18 can send additional query 
search strings to Requester 20 for further searches. This cycle of events is continued until 
all search topics have been searched. For example, a search of the site "AMAZON.COM" may 
include searching 15 different topics in that site. After each search, Query Manager 26 can 
determine from the DTD that there are additional topics to be searched. It can cause additional 
search topic(s) to be retrieved from the Search Topics database 12 and cause a new search 
string to be created for each search topic. In this fashion, Query Manager 18 can cause 15 
different query strings to be created, each of which will produce a different set of search 
results. The search results are processed (118) by Results Manager 26, and may include 
notifying the Query Manager 18 that the search cycle is complete and that another search may 
proceed (120). Results Manager 26 may also store the search results in, for example, a data 
repository 28, and can also associate the search data with the DTD information and search topic 
categories. Results Manager 26 may also be able to extract, analyze or summarize the search 
results and data. 
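
The cycle in FIG. 2 can be sketched in a few lines of Python. The sketch rests on stated assumptions: the DTD is reduced to a dictionary with a hypothetical `query_param` field, the query string format is invented for illustration, and `fake_requester` stands in for the HTTP request so the example runs without network access. Step numbers in the comments refer to the reference numerals in the text.

```python
def build_query_template(url, dtd):
    """Partial query string with a placeholder for the topic (format is assumed)."""
    return f"{url}?{dtd['query_param']}={{topic}}"


def gather(url_list, site_info, topics, requester):
    """Run every topic against every site, following result pages until exhausted."""
    repository = {}
    for url in url_list:                                   # step 100
        dtd = site_info[url]                               # step 102
        template = build_query_template(url, dtd)          # step 104
        for topic in topics:                               # steps 106 and 116
            query = template.format(topic=topic)           # step 108
            results, page = [], 1
            while True:
                items, has_more = requester(query, page)   # steps 110 and 112
                results.extend(items)
                if not has_more:
                    break
                page += 1                                  # step 114
            repository[(url, topic)] = results             # step 118
    return repository


def fake_requester(query, page):
    """Stand-in for an HTTP request: every query yields exactly two result pages."""
    return [f"{query}#p{page}"], page < 2


site = "http://example.test/search"
repo = gather([site], {site: {"query_param": "q"}}, ["books", "music"], fake_requester)
```

The inner `while` loop corresponds to the page navigation performed by the Search Results Manager, and the loop over `topics` corresponds to the repeated query strings issued until all search topics have been searched.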

CLAIMS: 



1. An automated method of gathering dynamic content and resources on the world wide web by 
simulating user interaction and managing session information, the method comprising the steps 
of: providing a site database of dynamic websites requiring interaction to download contents 
thereof, said site database containing session data for the dynamic websites and document type 
definitions ("DTD") including descriptions of how to interact with the dynamic websites; 
identifying and retrieving at least one uniform resource locator ("URL") for a dynamic website 
to be analyzed; identifying and retrieving a session data and DTD for said URL from the site 
database; creating a query template for the retrieved URL using said identified DTD describing 
how to interact with the URL to simulate user interaction; identifying at least one search 
topic to be searched on said URL; inserting said at least one search topic into said query 
template to form a search query string; querying said URL with said query string comprising said 
identified DTD and said at least one search topic; retrieving at least one result of said 
query, thereby automatically simulating user interaction with said dynamic website to gather 
and extract said at least one result. 

5. An article of manufacture comprising: a site database of dynamic websites requiring 
interaction to download contents thereof, said site database containing session data for the 
dynamic websites and document type definitions ("DTD") including descriptions of how to 
interact with the dynamic websites; and a computer usable medium having computer readable 
program code means for automatically gathering dynamic content and resources on the world wide 
web by simulating user interaction and managing session information, the computer readable 
program code means in said article of manufacture comprising: computer readable program code 
means to identify and retrieve a URL for a dynamic website to be queried; computer readable 
program code means to identify and retrieve a session data and DTD for said URL from the site 
database; computer readable program code means to create a query template for the retrieved URL 
using said identified DTD describing how to interact with the URL to simulate user interaction; 
computer readable program code means to identify at least one search topic to be searched on 
said URL; computer readable program code means to insert said at least one search topic into 
said query template to form a search query string; computer readable program code means to 
query said URL with said query string comprising said identified DTD and said at least one 
search topic; computer readable program code means to retrieve at least one result of said 
query, thereby automatically simulating user interaction with said dynamic website to gather 
and extract said at least one result. 

9. A computer program product comprising: a site database of dynamic websites requiring 
interaction to download contents thereof, said site database containing session data for the 
dynamic websites and document type definitions ("DTD") including descriptions of how to 
interact with the dynamic websites; and a computer usable medium having computer readable 
program code means embodied in said medium for automatically gathering dynamic content and 
resources on the world wide web by simulating user interaction and managing session 
information, said computer program product having: computer readable program code means for 
causing a computer to identify and retrieve a URL for a dynamic website to be queried; computer 
readable program code means for causing a computer to identify and retrieve a session data and 
DTD for said URL from the site database; computer readable program code means to create a query 
template for the retrieved URL using said identified DTD describing how to interact with the 
URL to simulate user interaction; computer readable program code means for causing a computer 
to identify at least one search topic to be searched on said URL; computer readable program 
code means to insert said at least one search topic into said query template to form a search 
query string; computer readable program code means for causing a computer to query said URL 
with said query string comprising said identified DTD and said at least one search topic; 
computer readable program code means for causing a computer to retrieve at least one result of 
said query, thereby automatically simulating user interaction with said dynamic website to 
gather and extract said at least one result. 

13. A computer program product for automatically gathering dynamic content and resources on the 
world wide web, said computer program product comprising: a site database of dynamic websites 
requiring interaction to download contents thereof, said site database containing session data 
for the dynamic websites and document type definitions including descriptions of how to 
interact with the dynamic websites; and a computer usable medium having computer readable 
program code means embodied in said medium for causing a computer to simulate user interaction 
and managing session information with a website, said computer program product having: computer 
readable program code means for causing a computer to determine at least one dynamic website to 
be searched, said website having a uniform resource locator; computer readable program code 
means for causing a computer to determine a session data and document type definition, from the 
site database, for said at least one dynamic website to be searched; computer readable program 
code means for causing a computer to create a query template for a website to simulate user 
interaction, said query template containing said uniform resource locator and said document 
type definition describing how to interact with the uniform resource locator; computer readable 
program code means for causing a computer to determine at least one search topic to be searched 
on said website; computer readable program code means for causing a computer to insert said 
topic into said query template to form a search query string; computer readable program code 
means for causing a computer to query said website with said query string; computer readable 
program code means for causing a computer to receive at least one result from said query; 
computer readable program code means for causing a computer to determine if there is a second 
search topic to be searched on said website; computer readable program code means for causing a 
computer to create a second search query string containing said uniform resource locator and 
said document type definition for said website and said second topic to be searched; computer 
readable program code means for causing a computer to execute a second query of said website 
with said second search query string; computer readable program code means for causing a 
computer to receive at least one result from said second query; computer readable program code 
means for causing a computer to execute a plurality of queries for a plurality of search topics 
to be searched on said website, 

thereby automatically simulating user interaction with said website to gather and extract 
results from said website. 
NOVELTY - The method involves receiving and storing a number of user queries and creating a 
query template that generalizes the user queries. Directory entries answering the query 
template are retrieved so that the directory entries are stored in the cache. The directory 
entries are retrieved after estimating the benefits of storing the directory entries in the 
cache. 
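
A minimal sketch of the idea, under loud assumptions: queries are modeled as attribute-to-value dictionaries, a template keeps the values all stored queries share and replaces the rest with a wildcard, and the benefit estimate (queries answered per cached entry) is hypothetical because this abstract does not state how benefits are estimated. All function names are illustrative.

```python
def generalize(queries):
    """Template keeps attribute values shared by all queries; others become '*'."""
    template = {}
    for attr in queries[0]:
        values = {q.get(attr) for q in queries}
        template[attr] = values.pop() if len(values) == 1 else "*"
    return template


def matches(entry, template):
    """True when the directory entry answers the template."""
    return all(v == "*" or entry.get(a) == v for a, v in template.items())


def estimated_benefit(queries_covered, entries_to_store):
    """Hypothetical benefit: stored queries answered per entry that must be cached."""
    return queries_covered / entries_to_store if entries_to_store else 0.0


def admit(directory, queries, cache, min_benefit=0.5):
    """Fetch and cache the template's entries only if the estimate is high enough."""
    template = generalize(queries)
    entries = [e for e in directory if matches(e, template)]
    if estimated_benefit(len(queries), len(entries)) >= min_benefit:
        cache[tuple(sorted(template.items()))] = entries
        return True
    return False


directory = [
    {"ou": "research", "cn": "ann"},
    {"ou": "research", "cn": "bob"},
    {"ou": "sales", "cn": "cat"},
]
queries = [{"ou": "research", "cn": "ann"}, {"ou": "research", "cn": "bob"}]
cache = {}
admitted = admit(directory, queries, cache)
```

Here the two stored queries generalize to "any entry in `ou=research`", and the template is admitted because caching two entries answers both queries.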



USE - Used for managing network directory cache. 



ADVANTAGE - The cache effectiveness is improved by maintaining a set of generalizations of 
queries and admitting such generalizations into cache when their estimated benefits are 
sufficiently high. 

DESCRIPTION OF DRAWING (S) - The drawing shows a flow chart of processing that is performed by 
the client in creating query templates. 